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i^ ABSTRACT 

^^ We present a fully probabilistic, physical model of the non-linearly evolved density field, as 

\^ probed by realistic galaxy surveys. Our model is valid in the linear and mildly non-linear 

1-H regimes and uses second order Lagrangian perturbation theory to connect the initial condi- 

tions with the final density field. Our parameter space consists of the 3D initial density field 
and our method allows a fully Bayesian exploration of the sets of initial conditions that are 
consistent with the galaxy distribution sampling the final density field. A natural byproduct 
\^ of this technique is an optimal non-linear reconstruction of the present density and velocity 

_7 fields, including a full propagation of the observational uncertainties. A test of these meth- 

f^ ^ ods on simulated data mimicking the survey mask, selection function and galaxy number of 

I the SDSS DR7 main sample shows that this physical model gives accurate reconstructions of 

O the underlying present-day density and velocity fields on scales larger than ~ 6 Mpc/h. Our 

^ method naturally and accurately reconstructs non-linear features corresponding to three-point 

^ and higher order correlation functions such as walls and filaments. Simple tests of the recon- 

I I structed initial conditions show statistical consistency with the Gaussian simulation inputs. 

Our test demonstrates that statistical approaches based on physical models of the large scale 
^-H structure distribution are now becoming feasible for realistic current and future surveys. 

> 

(^\ Key words: large scale - galaxy surveys - reconstruction -Bayesian inference 
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m 1 INTRODUCTION AND MOTIVATION acoustic oscillations of photon-baryon plasma. Within the standard 

w5 cosmology, the evolution and growth of the initial perturbations in 

rNl Ongoing and planned Large Scale Structure (LSS) surveys will ,. ^t ■ ■ n j » j ■ •■ i j j- .i 

^ ^ " 1= t- o \ r J ^jj expandmg Universe is well-understood m principle, and directly 

T-H measure the distribution of galaxies at an unprecendented level i- i j . ■. j , ,-, . u j i »» j j i 

. . Of linked to its dominant constituents such as dark matter and dark en- 

>of accuracy in the coming decade. These surveys are expected to t» .u f . i » i t cc j- ,i ■ 

JO J f ergy. It therefore seems natural to analyse LSS surveys directly in 

. ,— I vastly enhance our constraints on the physics of cosmogenesis,neu- ^ ex. ■ ^^ . ■ . »u i .u ■ ■.■ i j 

)/~j . •' . ^ -^ o ' terms ot the simultaneous constraints they place on the initial den- 

rS P y- ' &y P Sy- ^ijy ggjjj ^jj^j jjje physical evolution that links the initial density 

^ How do we compare cosmological models to these surveys? g^j^ ^^ ^^^ observed tracers of the evolved density field. 

^ We have an observationally well-supported physical model of the 

initial conditions. According to this model, a homogeneous and Fo^ a variety of good reasons the current state of the art of 

isotropic density field with small, very nearly Gaussian, and nearly statistical analyses of LSS surveys is far removed from this ideal. 

scale-invariant correlated density perturbations arose from quan- There are some areas where significant progress seems very dif- 

tum perturbations in the very early Universe. Gravitational evolu- ^'^"1'- I" Particular, a detailed physical model of the way galax- 

tion in an expanding background processed these initial conditions ''^^ arise in response to the spatial fluctuations in the dark matter 

into an evolved density field, at first through hnear transfer and then distribution is not computationally tractable (the "bias" problem). 

through non-linear structure formation. LSS surveys catalogue the Even for the dark matter alone, reversing the non-linear evolution 

positions of observed tracers of this evolved density field in redshift that link the initial and e volved density field is a fundamentally 

ill-posed problem (see e.g. |Nusser & Dekel|1992[|Crocce & Scoc-| 

It is now standard to model the initial Gaussian density pertur- |cimarro|2006^. 

bations statistically in terms of the early universe processes that ere- As a consequence, the state of the art in the analysis of galaxy 

ated them, such as the physics of inflation , the change from matter surveys addresses these problems in isolation. In the standard ap- 

to radiation dominated universe, neutrino free-streaming, and the proach, the link between theory and observation is made through 
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the power spectrum. This requires solving two separate problems: 
the data analysis problem of inferring the power spectrum from an 
observed sample of tracers given a survey mask and selection func- 
tion (see e.g.'Feldman et al.'1994iiTegmark & et al.|2004||Wandeh| 
erar, 20Q4: Eriksen et al. 2004; Percival 2005, |Jasche et al.|2010| 



Eisner & Wandelt||2012} ; and the much more difficult theoretical 



problem of modeling the power spectrum and the form of its like- 
lihood for the non-linearly evolved and biased galaxy density field 
(see e.g.[Baugh et al.'1995'. 'Peacock & Dodd s|1996] [Smith et al.] 
|2003[[Jeon g & Komatsu 2006 , Heitmann et al. 2 010[ and references 
therein). 

Three-dimensional inference of the matter distribution from 
observations requires modeling the statistical behavior of the 
mildly non-linear and non-linear regime of the matter distribution. 
The exact statistical behavior of the matter distribution in terms of 
a probability distribution for the fully evolved density field is not 
known. Previous approaches therefore relied on phenomenologi- 
cal approximations such as multivariate Gaussian or log-normal 
distributions incorporating a cosmological power-spectrum to ac- 
curately account for the correct two-point statistics of the density 
fields. Both of these distributions can be considered as maximum 
entropy prior on a linear and logarithmic scale, respectively, and 
are therefore well-justified for Bayesian analysis. However, these 
priors only parametrize the two-point statistics of the matter distri- 
bution. Since large scale structure formation through gravitational 
clustering is essentially a deterministic process described by Ein- 
stein's equations and since the only stochasticity in the problem 
enters in the generation of initial conditions, it seems reasonable 
to account for the increasing statistical complexity of the evolving 
matter distribution by a dynamical model. 

In this paper we describe progress towards such an approach 
that uses data to constrain a set of a priori possible dynamical, 
three-dimensional histories. We use second order Lagrangian per- 
turbation theory (2LPT) as a physical model of the gravitational 
dynamics that link the initial three-dimensional Gaussian density 
field to the observed, non-Gaussian density field. In Bayesian par- 
lance our prior for the evolved density is the initial Gaussian den- 
sity field evolved by a 2LPT model. Using the powerful sampling 
techniques recently developed by |Jasche& Kitaura (2010| we can 
use this model as prior information and explore the range of ini- 
tial Gaussian density fields that are statistically consistent with the 
data, modeled as a Poisson sample from evolved density fields. 

Our method will also automatically generate reconstructions 
of the large scale velocity field since our model incorporates dy- 
namics. Since the approach is implemented in a fully Bayesian 
framework we do not produce unique reconstructions, but a set of 
samples which can be interpreted as a probabilistic representation 
of the information the observations contain about the underlying 
density (initial and evolved) and the velocity field. In particular, the 
variations between samples represent the uncertainties that remain 
in the reconstruction owing to the modeled statistical and system- 
atic errors in the data. 



1.1 Comparison to prior work 

In the recent past several papers have pointed out the promise of the 
lognormal model in fitting to observations of the non-linear density 
field (see e.g. |Kitaura et al.|2010[ [Jasche & KitaurapOlO] [Jasche[ 
[et al.|20i0) . While the lognormal approach provides a good model 
of the 1 -point and 2-point functions of the field we will show that 
Gaussian statistics evolved by 2LPT reproduces those successes 
but, in addition, reproduces features naturally that are associated 



with the higher order n-point functions such as filaments and walls. 
This is not surprising since it is well known that 2LPT reproduces 
the exact one- and three-point statistics of fully non-linear density 
fields at large scales, and also approximates higher order statis- 
tics very well (see e.g. [Moutarde et al.|1991[[Buchert et al.|1994 



iBouchet et al. 1995 ; Scoc cimarro||200b| [Bernardeau et al.[]2002 
[Scoccimarro & Sheth 2002| ). 

The field of velocity field reconstructions has a long history 
(see e.g. 'Bertschinger et al."1990', 'Nusser & Dekel"1992'. 'Dekel 



etaL||1999, Frisch et al. 2002; Brenier et al. 2003; Mohayaee & 



Sobolevskii|2008|[Lavaux]2 008 ; Kitau ra et al.[201 1| >'. The contribu- 



tion of our approach is the imbedding of a non-Gaussian model in 
a probabilistic framework. Zel'dovich and MAK are, respectively, 
perturbative and non-perturbative attempts to reconstruct the dis- 
placement field linking the initial conditions from tracers of large 
scale structure and as such also generate estimates of the velocity 
field. Our approach goes beyond these works in several ways: we 
combine the inference with a detailed non-Gaussian model of real- 
istic survey features (mask, selection function and shot noise); we 
implement explicitly a Gaussian prior for the initial density field; 
and the Bayesian exploration gives a quantitative characterization 
of the uncertainties in our inferences. 

Significant effort has also been invested in establishing accu- 
rate representations of the observed Universe in numerical sim- 
ulations, by constraining simulations by observations (see e.g. 



Kravtsov et al.|2002[[Klypin et al.|2003|[Dolag et al.|2005[pb& 



'skind et al.'2010"Martine z-Vaquero et a .|2009[[Gottloeber et ah 
2010; Lavaux 2010). Many of these approaches rely on integrat- 
ing the observed present day density field backwards in time to the 
initial state. Such an approach is generally hindered due to incom- 
plete observations of the final state and by spurious erronous en- 
hancement of decaying mode power in the initial conditions during 
backward integration (Nusser & Dekel|1992). The fully probabal- 
istic approach, proposed in this work, naturally accounts for uncer- 
tainties of only partially observed final states, by exploring physi- 
cal reasonable solutions, filtered by the 2LPT model, for the initial 
conditions which can all lead to the same or similar final observa- 
tions. Furthermore, our method solely depends on forward evalu- 
ations of the model, which therefore accurately handels the issue 
of decaying mode power Also note that unique recovery of initial 
conditions is generally not possible on all scales due to the chaotic 
nature of the dynamical large scale structure formation process on 
small scales (see e.g. [Nusser & Dekel[1992[[Crocce & ScoccimarTo[ 
[2006[ >. These uncertainties will also be accurately accounted for by 
our method while exploiting information on the initial conditions 
on all scales accessible to the 2LPT model. 

The paper is structured as follows. In section[2]we discuss the 
design of posterior distributions for large scale structure inference 
and show that the complex problem of modeling accurate prior dis- 
tributions for the evolved non-Gaussian matter distribution can be 
recast as an initial conditions problem once a physical model for 
large scale structure formation is specified. Furthermore, we will 
present the resultant 2LPT-Poissonian posterior distribution for the 
inference of the three dimensional matter distribution from galaxy 
surveys. SectionlSlprovides a brief overview over the Hamiltonian 
sampling approach employed in the inference framework described 
in this work, and in section [4] we present the relevant derivations 
of the Hamiltonian forces required for an efficient numerical im- 
plementation of the Hybrid Monte Carlo sampler In the following 
section|6] we describe the generation of an artificial galaxy survey, 
inspired by the Sloan Digital Sky Survey data release 7 main sam- 
ple ([Abazajian et al.[2009[>. In section[7]we describe the application 
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of our method to this simulated data in order to provide a proof of 
concept and to estimate the behavior of the algorithm in a realistic 
setting. We will conclude the paper with a summary and a discus- 
sion of the results in section[8] 



formation. This process amounts to generating samples from the 
joint prior distribution of the final and initial conditions: 



msj], {s-,]) = ms;]) Y] ^°(^/ - G{a, 6-),) . 



(3) 



2 THE 2LPT-POISSONIAN POSTERIOR 

2.0.1 The non-Gaussian density prior 

As already pointed out in the introduction, inferring the three 
dimensional large scale structure from observations requires the 
design of suitable prior distributions for the fully gravitationally 
evolved density field. Standard approaches such as Wiener filtering 
employ Gaussian priors, which are assumed to be suitable for the 



inference of the largest scales(see e.g. Lahav 1994, Zaroubi 2002 
Erdogdu et al.|2004||Kitaura & EnBhn_2008,,Kitaura et al._20"09 



Jasche et al.||20 10) 7 For the inference of the density field in the 
non-linear regime log-normal priors have been proposed and suc- 
cessfully applied to large scale structure inference problems (|Ki 



[taura et al. 2010; Jasche & Kita ura|2010[ [Jasche et al. 20101. More 
recently, ^Kitaura_ ( ^20 12} proposed to use Edgeworth expansions to 
construct prior distributions incorporating also third order moments 
of the distribution. All of these approaches are based on heuristic 
approximations to the full problem. Currently, a closed form de- 
scription of the present day density field in terms of a multivariate 
probability distribution does not exist. 

While there exist considerable difiiculties in constructing a 
suitable probability distribution for the present day density field, 
the initial seed fluctuations at redshifts z ~ 1000 obey Gaussian 
statistics to great accuracy, in agreement with theories of infla- 
tion and observations (see e.g. |Linde|2008[|Komatsu et al.|2011| ). 
Therefore, the complicated nature of the present matter distribu- 
tion solely originates from deterministic physical processes during 
structure formation. Generally, gravitational interactions introduce 
mode coupling and phase correlations, such that the statistical be- 
havior of the present day density strongly deviate from a Gaussian 
distribution (see e.g. |Peacock|1999^ . 

Since initial and final conditions are linked via deterministic 
structure formation processes, it seems reasonable to formulate the 
inference problem in terms of simpler statistics at the initial con- 
ditions, rather than approximating the complex statistical behavior 
of the non-linear matter distribution. More specifically, given a suit- 
able model of large scale structure formation G{a, 6') we can obtain 
a prior distribution for the final density contrast 5' for a given cos- 
mic scale factor a by marginalizing over the initial conditions: 



niSil) = jd{6-]P({6{] 



[s-,]) 



I 



d{d;]'P{{s;])n{si]\{6',]), 



(1) 



where, for a deterministic structure formation model, the condi- 
tional probability is given by Dirac delta distributions: 



msimS-])) = Y] S°{6{ - G{a, S-)i) . 



(2) 



Given a model G{a, S') for structure formation, a prior distribution 
for the present day density field can be obtained by a two step sam- 
pling process, by first generating an initial conditions realization 
from the prior distribution 'P({(5J)) and then propagating the initial 
state forward in time with a suitable model of large scale structure 



By discarding the initial density realization, one obtains a realiza- 
tion from the prior distribution for the non-linear final state. Assum- 
ing, a multivariate Gaussian process with zero mean and covariance 
matrix S to generate the initial density field 6' the joint prior distri- 
bution is given as: 

mS-,], {S{]\S)) = P{16-,]\S) Y] 6" (<5f - G{a, 6')) 



£-7 E/m ■s;i',,>;„ 

det(2;r5) 



f](5°(^/-G(fl,n) 



(4) 



In this work, we will employ a second order Lagrangian per- 
turbation theory (2LPT) model to approximately describe gravita- 
tional large scale structure formation (also see appendix [B] for an 
overview over the 2LPT model). 2LPT is able to recover the ex- 
act one-, two- and three-point statistics at large scales, and also 



approximates higher order statistics very well (see e.g. Moutarde 



et al. 1991 ; Buchert et al. 1994; Bouchet et al.|1995[|Scoccimarro 
2000, Scoccimar ro & Sheth 2002) . The 2LPT model therefore nat- 
urally reproduces physically reasonable higher order statistics in 
the matter inference problem without requiring the introduction of 
additional parameters for the description of higher order statistics. 
Our approach therefore provides a physically meaningful way of 
matching the higher order statistics of the evolved density field 
while requiring no further knowledge other than the initial two- 
point statistics. 



2.0.2 The large scale structure likelihood 

Above we demonstrated that the task of modeling accurate prior 
distributions for the statistical behavior of the present day matter 
distribution can be recast into an initial conditions inference prob- 
lem once a model for large scale structure formation is specified. 

The methods described in this work are general and can be 
adapted for the inference from any particular probe of the three 
dimensional large scale structure. We will illustrate our method for 
the case of optical galaxy redshift surveys. 

Galaxies tend to follow the gravitational potential of the cos- 
mic matter distribution and thus can be considered as matter tracers. 
The statistical uncertainty due to the discrete nature of the galaxy 
distribution can be modeled as an inhomogeneous Poisson processe 
(see e.g. |Layzer| 1956; Peebles 1980, Marti nez & Saar 2002l. Also 
note that Poissonian likelihoods have akeady been successfully em- 
ployed for non-linear density field inference (see e.g. |Kitaura et al.| 
[20Tbl [Jasche & Kitaura|20T0i [Jasche et al.|2010| for details). Fol- 
lowing this approach, the corresponding Poissonian likelihood dis- 
tribution can be expressed as: 



n{Ni\u,\) = Y[ 



Nl\ 



(5) 



where N^ is the observed galaxy number at position x^ in the sky 
and At is the expected number of galaxies at this position. The mean 
galaxy number is related to the final density field Sf, via: 



At = At(6)=RtN{l + B(5f)t), 



(6) 
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where /?j. is a linear response operator, incorporating survey geome- 
tries and selection effects, N is the mean number of galaxies in the 
volume and B(x}i, is a nonlinear, non local, bias operator at position 
j^k (also see [Jasche & Kitaura|2010[ [Jasche et al.|2010| for further 
discussions). 

The joint posterior distribution for the initial conditions (5J and 
the final density field i5, given the galaxy observations is then ob- 
tained by the multiplying equation (HI and (pi: 

Pi{s;]As{]\{N,],s) = 



det(2;r5) 



Y]6''{6{-G(a,n) 



n 



msf)f 



-Ai,(6f) 



Nil 



(7) 



We see that given a model of structure formation G(a, 6'), the final 
density field Sj is a free byproduct of the inference process. Conse- 
quently, marginalizing equation il i over 6' yields the desired target 
posterior distribution for large scale structure inference: 



>{i6',]\m,s) 



-7 1.1m ^'1^1,,,^, 

det(2nS) 



n 



(it(G(a,5')))'''e-'''(^<'''''>) 



iVf! 



(8) 



This result requires several remarks. First, A nearly trivial, but nev- 
ertheless important, conclusion to draw from (Isj is that large scale 
structure inference depends solely on the initial conditions S'j. Con- 
sequently, the complex task of designing suitable prior distributions 
for the inference of the evolved density field can be recast into an 
initial value problem by modeling a suitable physical model to ac- 
count for the complexity of the final state. 

Second, note that inferring the initial density field does not in- 
volve backward in time integration of the physical model. The in- 
ference process exclusively requires model evaluations in the for- 
ward time direction, counter to the widely held notion that infer- 
ence of initial conditions requires backward integration of the equa- 
tions of motion. Nevertheless, traditional approaches of initial con- 
ditions inference, also known as 'time machines', rely on the inver- 
sion of the flow of time in the model equations (see e.g. [Nusser &[ 
[Dekel 19921. As pointed out by [Nusser & Dekel[ (|1992l, the dis- 
advantage of backward integration is that it may lead to artificial 
increase of decaying modes amplitudes introducing erroneous arti- 
ficial density and velocity fluctuations in the initial conditions. Also 
note that large scale structure surveys only provide limited informa- 
tion on the full final state, due to survey geometries and statistical 
uncertainties. These problems generally hinder a unique backward 
integration of the partially observed final state to the initial condi- 
tions. 

To alleviate this problem, and to ensure physical meaningful 
backward integration of non-linear models, one has to augment un- 
observed regions in the data with statistically meaningful informa- 
tion mimicking the unobserved part of the evolved density field. 
This in turn requires accurate knowledge on the multivariate prob- 
ability distribution for the evolved final state (5, , which is not known 
at present. 

Such problems are naturally prevented by casting the recon- 
struction of initial conditions as the statistical inference problem 
expressed in equation (Isl. Since the proposed method solely de- 
pends on forward model evaluations, unobserved regions and sta- 
tistical uncertainties can be easily dealt with in the initial condi- 



tions, which amounts to modeling simple, uncorrelated Gaussian 
processes. Further, the posterior distribution proposed in equation 
(Isl accounts for systematics, such as survey geometry, selection 
efi'ects and biases but also for statistical uncertainties such as the 
noise of the galaxy distribution and cosmic variance. 

We therefore see that statistical uncertainties do not allow a 
unique inference of the initial conditions from partially observed fi- 
nal states. Consequently, our approach aims at exploring the highly 
non-Gaussian and non-linear posterior distribution 'P(j(5J)|jA',j,5 j 
of the initial density field (5J conditional on galaxy observations 
A'; in order to quantify the uncertainty and significance of the in- 
ferred initial conditions. The overall inference process is numer- 
ically non-trivial. It is made possible by sophisticated non-linear 
analysis methods, which will be described in the following. 



3 HAMILTONIAN SAMPLING 

As described in the previous section, exploration of the initial con- 
ditions posterior distribution requires numerically efficient Markov 
Chain Monte Carlo methods to accurately account for all non- 
linearities and non-Gaussianities involved in the inference process. 
Unfortunately, unlike as in the Gibbs sampling approach for large 
scale structure proposed in [Jasche et al.| j2010[ l, direct sampling 
from this posterior is not possible. One therefore has to rely on a 
sampling procedures with an accept-reject step for the exploration 
of the high dimensional parameter space encountered in this prob- 
lem. A major draw back of traditional algorithms of this type, e.g. 
Metropolis-Hastings, is their dominant random walk behavior and 
a possible high rejection rate if no suitable proposal distribution can 
be designed. In this work, we usually deal with about 10*, or more, 
free parameters 5J which correspond to the initial density contrast 
amplitudes at the volume elements of the analyzed volume. Due 
to this high dimensionality of the problem, a low acceptance rate 
of the Metropolis-Hastings algorithm would result in a prohibitive 
computational cost for the method. Given this situation, we propose 
to use a Hybrid Monte Carlo (HMC) method, which in the absence 
of numerical errors, would yield an acceptance rate of unity. The 
HMC method exploits techniques developed to follow classical dy- 
namical particle motion in potentials ( [Duaneetal.|1987 ; Neal 19931 
[1996[ >. In this fashion the Markov sampler follows a persistent mo- 
tion through the parameter space, suppressing the random walk be- 
havior. This enables us to sample with reasonable efficiency in high 
dimensional spaces ( Hanson 2001 1. Furthermore, the HMC has al- 
ready been successfully applied to non-linear large scale structure 
inference problems (see e.g. [Jasche & Kitaura|2010{ [Jasche et al.[ 
[20T0| >. 

In the following, we will just briefly outline the basic idea 
of the hybrid Hamiltonian sampling algorithm. More detailed de- 
scription of the algorithm and its application in large scale structure 
inference and in general can be found in ( Duane et al .|1987 Neal[ 
,1 993 „Hanson,2001„ Jasche & Kitaura|20 10,, Jasche et al.|2010| [ 



3.1 The HMC 

Suppose, we wish to generate samples from a probability distribu- 
tion P({xj]), where jx, j is a set consisting of the N elements Xj. If 
we interpret the negative logarithm of this posterior distribution as 
a potential: 



t//(x) = -biCPix)) , 



(9) 
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J. completeness 

Figure 1. Selection function f(z) as a function of redshift z (left panels) and the two dimensional completeness map for the SDSS DR7 (right panel). 



and by introducing a 'momentum' variable p, and a 'mass matrix' 
M, as nuisance parameters, we can formulate a Hamiltonian de- 
scribing the dynamics in the multi dimensional phase space. Such 
a Hamiltonian is then given as: 

2' 



H 



ZjZj^P-^7iPj + ^(^)- 



(10) 



As can be seen in equation (Tol, the form of the Hamiltonian is such 
that the joint distribution is separable into a Gaussian distribution 
in the momenta jp,j and the target distribution 'P(jx,)) as: 



e-"=r({x,])e-^-^^^'"-''u 



(11) 



For this reason, marginalization over all momenta will again yield 
the original target distribution 'P(jx, j). 

In order to generate a random variate from this joint distri- 
bution, being proportional to exp(-H), one first draws a set of 
momenta from the distribution defined by the kinetic energy term 
that is an A' dimensional Gaussian with a covariance matrix M. 
Then one follows the deterministic dynamical evolution of the sys- 
tem, given a starting point ({j,j, (p,)) in phase space for some fixed 
pseudo time t according to Hamilton's equations: 

dXi dH 
dt dpi ' 
dpi _dH __ dip{x) 
dxi 



(12) 



(13) 



dt dxi dxj 

The integration of this equations of motion yields the new position 
({JC;}. {p'i}) in phase space. This new point is accepted according to 
the usual acceptance rule: 



Pa = min [1, exp(- {H{{x\], {p\]) - H{{x,], [p,]))] 



(14) 



Since the equations of motion provide a solution to a Hamiltonian 
system, energy or the Hamiltonian given in equation \iQ) is con- 
served, and therefore the solution to this system provides an ac- 
ceptance rate of unity. In practice, numerical errors can lead to a 
somewhat lower acceptance rate. Samples of the desired target dis- 
tribution are then obtained by simply discarding the auxiliary mo- 
menta {p/j, which amounts to marginalization over these nuisance 
parameters. The particular implementation of the hybrid Hamilto- 
nian Monte Carlo sampler for the problem described in this work 
will be discussed below. 



4 EQUATIONS OF MOTION FOR LSS INFERENCE 

As described above, the HMC approach permits to explore the non- 
linear large scale structure posterior by following Hamiltonian dy- 
namics in the high dimensional parameter space. The correspond- 
ing forces, required to evaluate these Hamiltonian trajectories can 
be derived from the large scale structure posterior given in equation 
llsl. Consequently, the Hamiltonian potential 1'({(5Jj) can be written 
as: 

T({<5;i) = -ln(n{5;il{Af,l,5)) 

= ^prioAiSj}) + ^ likelihood({S i}) , 



with the potential T,,„or({<5)l) is given as: 

and 'VnkeUhood{{6\]) is given as: 
"VukeUhoodm) = Yj ^*^'?''' ( 1 + G(a, 5%) 

k 

-Nk\n[RtN,,,(l+G(a,S\)), 



(15) 



(16) 



(17) 



Given the above definition of the Hamiltonian potential 1'(j(5J)) one 
can obtain the required Hamiltonian forces by differentiating with 
respect to 6' : 



d6- 



d'V„„,,{{S\}) d'V, 



A[S\}) 



dS- 



Here the prior term is given by: 



dS' 



d'i'pr.oraSi]) 

dSL 



Y^S^^r 



(18) 



(19) 



In contrast the likelihood term cannot be derived as trivially. A de- 
tailed derivation for the likelihood term can be found in Appendix 
[d| The likelihood term ^ ukeiihood(\6'i])) can be expressed as: 



a^,, 



MS)\) 



06' 



-£»' J,, + D^Y (■^r** + ■^f"" ~ ^t;*"*) 



(20) 
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Figure 2. The plot demonstrates the initial burn-in drift of successive power-spectra, measured from the initial density fields, towards the true underlying 
solution. Successive samples are color coded corresponding to their sample number as indicated by the color bar on the right. Black dashed lines correspond to 
the true underlying values. Lower panels depict the successive deviation f from the true values, as described in the text, for the measured power-spectra. The 
sequence of 800 successive samples, visualizes how the sampler approaches the true underlying values and starts exploring the pai'ameter space around them. 



where the vector 7,, and the tensor r° 



are defined in Appendixpl 
Finally, the equations of motion for the Hamiltonian system 
given in equations jl2|( and \13\ can be written as: 



dSi, 
dt 

and 

dpn 

dt 



=z 



MZ 



Pj' 



(21) 



■J]s;jd) + D'j„ + D'J]{T:"' 



■ 2Tf '=*) . (22) 



New samples from the large scale structure posterior can then be 
obtained by following the dynamical evolution of the Hamiltonian 
system in phase space. 



5 NUMERICAL IMPLEMENTATION 

We named our numerical implementation of the initial conditions 
sampler BORG (Bayesian Origin Reconstruction from Galaxies). 
It utilizes the FFTW3 library for Fast Fourier Transforms and 
the GNU scientific library (gsl) for random number generation 
( |Frigo & Johnson|2005[|Galassi et al.|200 3). In particular, we use 
the Mersenne Twister MT19937, with 32-bit word length, as pro- 
vided by the gsl_mg_mtl9937 routine, which was particularly de- 
signed for Markov Chain Monte Carlo simulations ( [Matsumoto &| 
|Nishimura|l998l l. 



5.1 The leapfrog scheme 

The numerical implementation of the HMC sampler employed in 
this work largely follows the implementation described in [Jasche] 
|& Kitaura|(|2010[>. Generally the numerical integration scheme is 



required to meet some conditions in order to achieve optimal ef- 
ficiency of the sampler. High acceptance rates require the numer- 
ical integration scheme to be highly accurate in order to conserve 
the total Hamiltonian. Low accuracy in the integration scheme will 
generally lower the acceptance rate. Additionally, the integrator 
must be symplectic, meaning exactly reversible, in order to ensure 
the Markov Chain satisfies detailed balance ( [Duane et al.||1987] ). 
For this reason, we implemented the leapfrog scheme to integrate 
the Hamiltonian system. It is also numerically robust, and allows 
for simple propagation of errors. In particular, we implement the 
Kick-Drift-Kick scheme. The equations of motions are integrated 
by making n steps with a finite step size e, such that t = ne: 



P.„(r^^)- 



Pm{t) - 



d^m) 



dd'i 



<5™(') 



S',„(t + e) = 6;„(t)- 



hi)- 



(23) 



(24) 



P,„ (t+e) = p, 



hi)- 



dii^as-]) 



86', 



(25) 

4,('+e) 

We iterate these equations until t = t. Also note that it is important 
to vary the pseudo time interval t, to avoid resonant trajectories. We 
do so by drawing n and e randomly from a uniform distribution. 

5.2 Hamiltonian mass 

The HMC possesses a large number of tunable parameters con- 
tained in the 'mass' matrix M which have an important effect on 
the performance of the sampler. The Hamiltonian mass defines the 
inertia of individual parameters when moving through the param- 
eter space. Consequently, too large masses will result in slow ex- 
ploration efficiency, while too light masses will result in large nu- 
merical errors of the integration scheme leading to high rejection 
rates. 
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Figure 3. Acceptance rates for successive samples (left panel) and the execution time per sample (right panel). It can be seen that the acceptance rates drops 
during the initial burn in phase and finally stabilizes at about 84 per cent. The left panel demonstrates the scatter in the execution times of individual samples. 
The average execution time is about 300 seconds as indicated by the solid blue line. 



Generally, if the individual (5J would be Gaussian distributed, 
a good choice for HMC masses is to set them inversely propor- 
tional to the variance of that specific (5J iJTaylor et al.||2008|. For 
non-Gaussian distributions it is reasonable to use some measure of 
the width of the distribution ( [Taylor et al.|2008| . For example, |Neal| 
( |1996) proposes to use the curvature at the peak. A suitable ap- 
proach to define Hamiltonian masses is to perform an approximate 
stabihty analysis of the numerical leapfrog scheme (see e.g. |Taylor| 
|et al.|2008{ [Jasche & Kitaura|2010^ . Following this approach, we 
will expand the Hamiltonian forces, given in equation fTSj, around 
a mean signal fj, which is assumed to be the mean initial density 
contrast once the sampler moved beyond the burn-in phase. As de- 
scribed in Appendix IF] approximating the Hamiltonian forces to 
linear order amounts to approximating the target distribution by a 
Gaussian distribution. According to the discussion in Appendix |F| 
the Hamiltonian masses should be set as: 



M,„ 



■ ■-> mj "mj ^ 



dJjis) 



ds, 



(26) 



where Jj is defined in Appendix [D| Calculation of the leapfrog 
scheme requires inversions of M. Considering the high dimension- 
ality of the problem, inverting and storing M"' is computationally 
impractical. For this reason, we construct a diagonal 'mass matrix' 
from equation \26\ . We found that choosing the diagonal of M, as 
given in equation l|26|), in its Fourier basis yields faster convergence 
for the sampler than a real space representation since it accounts for 
the correlation structure of the underlying density field. 



6 GENERATING MOCK OBSERVATIONS 

In the previous sections we presented the derivation and the im- 
plementation of our method. Here we will describe the generation 
of mock data sets that will be used to test our method. Following 
closely the description in [Jasche & Kitaura] ^2010) , we will first 
generate a realization for the density contrast 6] from a normal dis- 
tribution with zero mean and a covariance matrix corresponding to 
a cosmological power-spectrum in a a three dimensional Cartesian 
box with Nsjje = 128, corresponding to A',,oa- = 2097152 volume 



elements, and a co-moving box length of L = 750Mpc/;"'. The 
density field will then be scaled back to an initial time correspond- 
ing to a cosmological scale factor «,„,, = 0.001 by multiplication 
with a cosmological growth factor £)^(a„,„), described in appendix 
[a] In particular, we use a cosmological power- spectrum for the un- 
derlying matter distribution, with baryonic wiggles, calculated ac- 
cording to the prescription described in Eisenstein & Hu[([1998f and 
[Eisenstein & Hu[ ( |l999[ ). We assume a standard ACDM cosmology 
with a set of cosmological parameters (n„, = 0.22, fl^ = 0.78, 
D,, = 0.04, h = 0.702, erg = 0.807, n, = 0.961 ). The Gaussian 
initial conditions are then propagated forward in time using sec- 
ond order Lagrangian perturbation theory as described in appendix 
[B] From the resultant particle distribution the final density contrast 
field dj is constructed via the cloud in cell (CIC) method (see e.g. 
[Hockney & Eastwood|1988) . 

An artificial galaxy catalog is then generated by simulating 
the inhomogeneous Poisson process given by equation (|5j on top 
of the final density field (5, . In order to set up a realistic testing 
environment, we seek to emulate the main features of the Sloan 
Digital Sky survey as closely as possible. We employ the survey 
geometry of the Sloan Digital Sky Survey data release 7 depicted 
in the right panel of figure [T] The mask has been calculated us- 
ing the MANGLE code provided by [Swanson et al.[ ( |2008 1 and has 
been stored on a HEALPIX map with 11,^^ = 4096 l |G6rski et af] 
[21305 ). Further, we assume that the radial selection function fol- 
lows from a standard Schechter luminosity function with standard 
r-band parameters (a = -1.05, M, - 51og[o(/i) = -20.44 ), and we 
only include galaxies within an apparent Petrosian r-band magni- 
tude range 14.5 < r < 17.77 and within the absolute magnitude 
ranges M^,,, = -17 to M,„„v = -23. The corresponding radial se- 
lection function f(z) is then obtained by integrating the Schechter 
luminosity function over the range in absolute magnitude: 



fiz) 



i 



'M„,i„(z) 



cD(M) dM 



X,'""cD(M)dM 



(27) 



where 0(M) is given in appendix [C] The resulting selection func- 
tion for the simulated galaxy sample is depicted in the left panel 
of figure [T| The survey response operator Rj, required to simulate 
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Figure 4. Correlation length for different signal to noise values ^fN, as indicated in the legend. 



the Poisson process, can be calculated as the product of the sur- 
vey geometry and the selection function at each point in the three 
dimensional volume: 



^(Pi(k))=l. 



PM 
P°(k) 



(29) 



R, = M,F, = M(a„S,)f'{z,), 



(28) 



Finally, we choose A' = 9.93, and perform the Poisson sampling on 
the grid resulting in a total number of galaxies N,a, = 484227. 



7 TESTING 

In this section, we describe the application of our method to the 
artificial data set described in section [6] The primary intention of 
the following tests is to evaluate the performance of our method in 
realistic situations, taking into account observational systematics 
and uncertainties. 



7.1 Testing convergence and correlations 

The Metropolis Hastings Sampler in general and the HMC in par- 
ticular are designed to have the target distribution, in our case the 
large scale structure posterior distribution, as its stationary distri- 
bution (see e.g. |Metropohs et al.|195'3||Hastings|T970{|Neal|1993| >. 
For this reason, the sampling process will provide us with samples 
from the specified large scale structure posterior distribution after 
an initial bum-in phase. The length of this initial burn-in phase has 
to be assessed using numerical experiments. 

Generally, bum-in manifests itself as a systematic drift of the 
sampled parameters towards the true parameters from which the ar- 
tificial data set was generated. This behavior can be monitored by 
following the evolution of parameters in subsequent samples (see 
e.g. |Eriksenet a l. 2004 , J asche et al.|2010| l. To test this initial burn- 
in behavior we will arbitrarily reduce the power of the random ini- 
tial density field S] by a factor of 0.1, and observe the drift towards 
the true underlying values by following successive power-spectra 
measured from the samples. In figure [2] successive power-spectra 
of the first 800 samples are presented. The plot nicely demonstrates 
the systematic drift towards the true underlying solution by succes- 
sive restoration of the true power in the initial density field. 

More specifically, we can quantify the initial burn-in behav- 
ior by comparing the ith power-spectrum measurement Pj(k) in the 
chain to its true underlying value P'^(k) via: 



The results for this exercise are presented in the lower panel of fig- 
ure [2] It can be nicely seen that the algorithm restores the correct 
power an all scales and drifts towards preferred regions in parame- 
ter space to commence exploration of the large scale structure pos- 
terior. It is also important to remark that in this test, we do not ob- 
serve any particular hysteresis for the poorly constrained large scale 
modes, meaning they do not remain at their initially set values but 
efficiently explore the parameter space. This already indicates the 
ability of our algorithm to account for artificial mode coupling as 
introduced by the survey geometry. 

Burn-in also manifests itself in the initial acceptance rate as 
shown in the left panel of figurels] The dip in the initial acceptance 
rate function corresponds to the point when the sampling algorithm 
restored the full power-of the initial density field. This has a simple 
explanation. Initially, since the power was reduced by a factor of 
ten, the system behaved more or less linear since the displacement 
of 2LPT particles is small. Once the correct power is restored the 
displacement of particles increases, leading to a higher non-locality 
of the system. When the dip in the acceptance rate occurs, the sam- 
pler starts exploring physically more reasonable states in the initial 
conditions which can explain the observations. This process is ac- 
companied by an initially lower acceptance rate. Once the sampler 
moves to a reasonable region in parameter space the acceptance rate 
approaches asymptotically a rate of about 84 percent. This high ac- 
ceptance rate, combined with the fast de-correlation properties, we 
will demonstrated next, shows that our sampler makes sampling 
from this multi-million dimensional, non-linear posterior numeri- 
cally feasible. 

In particular, these tests show that the algorithm requires an 
initial bum-in phase of on the order of 600 samples before provid- 
ing samples from the target distribution. Also note that the initial 
burn-in period can be shortened by initializing the algorithm with 
an initial density field which is closer to the true solution. 

Another important issue to study when dealing with Markov 
Chain Monte Carlo methods is the mixing efficiency of the algo- 
rithm. Generally, successive samples in the chain will not be in- 
dependent but correlated with previous samples. Consequently, the 
correlation between successive samples determines the amount of 
independent samples which can be drawn from the chain. We study 
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Figure 5. Three slices through a sample of the initial density field (top panels), the final density field (middle panels) and through the corresponding data cube 
represented by the galaxy number counts (lower panels). The plots nicely demonstrate the correlation between the final density field and the data. To some 
extent, one can observe the connection between large structures in the initial conditions and the final density field. 



this effect by following a similar approach as described in l |Eriksen| 
let al.|2004| > or ( |Jasche et al.|2010y . 

Assuming all parameters in the the Markov chain to be in- 
dependent of one another one can estimate the correlation between 
subsequent density samples by calculating the autocorrelation func- 
tion: 



C(d)„ 



S'-{S) S'*"-{6} 



VVar (5) VVar((5) 



(30) 



where n is the distance in the chain measured in iterations (also see 
e.g.|Eriksen et al.|2004||Jasche et al.|2010[ for a similar discussion). 



The results for this analysis are presented in figure El where we 
plot the correlation coefficients for individual density amplitudes 
selected by their signal to noise ratio. It can be generally seen that 
the mixing efficiency is a little lower in regions with low signal- 
to-noise but the mixing eificiency of the algorithm is very good 
overall. 

We can further define a correlation length of the Markov sam- 
pler as the distance in the chain n^ beyond which the correlation 
coeificient C(6)„ has dropped below a threshold of C'''{S)„ =0.1. 
As can be seen in figureHlthe correlation length is about 200 sam- 
ples, demonstrating the high mixing efficiency of the algorithm. 
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Figure 6. One-point distributions for tlie density contrast in tlie initial field (left panel) and for the final field (right panel) measured from the samples. It can 
be seen that, while the inferred initial density field follows a Gaussian distribution, the final field exhibits the highly skewed log-normal like behavior 



Despite the high dimensionality of the problem considered here, 
these tests demonstrate that exploring large scale structure poste- 
rior for the initial conditions from observations exhibiting system- 
atic uncertainties and uncertainties is numerically feasible with our 
method. 



7.2 Large Scale Structure inference 

In this section we will discuss the results obtained from the applica- 
tion of the algorithm to the artificial data set, as described in section 
pi The initial and final density fields have been inferred on a 128^^ 
cubic Cartesian box with side length of 750 Mpc//?, resulting in a 
grid resolution of about ~ 6Mpc//j. While sampling the algorithm 
will provide matter field realizations for the initial and final 2LPT 
density fields, generated conditionally on the observed data. 

In figure Is] we show slices from three different sides through 
samples of the initial and corresponding final densities as well as 
through the data. It is immediately visible that the statistics of the 
initial and final matter fields differ greatly. While the initial density 
field appears to be very Gaussian, the final density field is clearly 
non-Gaussian. This demonstrates how the physical 2LPT model for 
structure formation is able to account for the growing statistical 
complexity of the density distribution during the evolution from 
the initial to the final state. Furthermore, comparison of the final 
density field (middle panels of figurelSj to the data (lower panels of 
figure [Sl demonstrates the recovered structures from the data. One 
can nicely see that the algorithm tries to extrapolate unobserved 
filaments between clusters based on the physically reasonable as- 
sumptions provided by the 2LPT model. In general, the algorithm 
nicely recovers the filamentary structure of the matter distribution. 

Figure B] illustrates that the algorithm accurately accounts for 
survey geometry and selection effects by augmenting unobserved 
or poorly observed regions with statistically correct information. 
The inferred initial and final density fields possess equal power 
throughout their entire domains and are not affected by selection 
or mask artifacts. Individual density samples can be understood as 
physical, three-dimensional matter field realizations, at least to the 



degree permitted by the 2LPT model. It is particularly interesting 
that unobserved and observed regions in the inferred final density 
fields do not appear visually distinct, a consequence of the excel- 
lent approximation of the 2LPT not just to the first but also higher 
order moments ( jMoutarde et al.|19 9r]|Buchert et al.|1994|| Bouchet] 
let al.|I995|[Scoccimarro|2000l |Scoccimarro & Sheth 2002). This 
is a great advantage over previous methods based on Gaussian or 
log-normal models where the assumption of a cosmological power- 
spectrum only specifies the two-point statistics correctly. In par- 
ticular, the reader may want to compare with figure 2 in |Jasche] 
|et al.| ( |20T0^ , where unobserved regions are augmented with a log- 
normal model unable to represent filamentary structures. 

In figure [6] we compare the one-point distribution of the in- 
ferred initial and final density field measured from the correspond- 
ing samples. It can be seen that while the initial density contrast 
follows Gaussian statistics, the final distribution is highly skewed 
and represents the expected log-normal like behavior. These results 
therefore supports our initial claim that the complex problem of 
modeling a prior distribution for the present fully non-linear density 
field can be exchanged for an initial conditions inference problem 
when assuming a physical model which accounts for the increas- 
ing statistical complexity of the matter distribution during structure 
formation. 

Further, we estimate the accuracy of the recovered density 
field by estimating the correlation coefficient r(x) between density 
samples and the true underlying solution as a function of some pa- 
rameter X. The correlation coefficient is given as: 



r(k,) ■ 



(6U6r) 



^|(is^)^|(m 



(31) 



'■f> 



where we will choose x to be the signal to noise ratio VA' for the 
final density field and a specific smoothing scale k,i, for the initial 
density field. The results for these tests are demonstrated in fig- 
ure [7] The left panel of figure [7] depicts the correlation between 
the true underlying final density field and the final density samples 
as a function of the signal to noise ratio. It can be seen that the 
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Figure 7. Cross correlation coefficient between the true final density field and a sample as a function of signal to noise (left panel) and the cross corelation 
between the true underlying initial density field and the infeiTed ensemble mean initial field as a function of smoothing scale k,i, (right panel). It is interesting 
to remark, that the correlation between true underlying and samples of the density field still amounts to about 55 per cent in regions where only a single galaxy 
has been observed. 



correlation with the truth is generally higher for higher signal to 
noise ratios. Even in zones that contain just a single galaxy we still 
get a correlation of about 55 percent. It is also remarkable that the 
algorithm still provides a 10 percent correlation with the true un- 
derlying density field in regions which have not been sampled by 
galaxies such as centers of voids or masked regions. The right panel 
of figure|7]demonstrates the cross correlation between the true un- 
derlying initial conditions and the inferred ensemble mean initial 
field as a function of filter scale k,i, when smoothed with a spheri- 
cal top hat filter in Fourier space. These results clearly demonstrate 
that the large scales of the initial conditions can be much easier re- 
covered than the small scale features. This is in agreement with the 
expectation, since the largest scales behave more linearly than the 
smaller scales and hence are easier to recover. Particular the shot 
noise contribution at the smallest scales in the final galaxy observa- 
tion will smear out features in the initial conditions, since the 2LPT 
displacement vector for the particles will fluctuate on these scales. 



7.2.1 Dynamics 

Importantly, the algorithm provides dynamical information on the 
large scale structure given the 2LPT model. In figure [8] we show 
the comparison between the true underlying velocity field, and the 
velocity field inferred by an randomly selected sample. It can be 
seen that the algorithm is able to recover the true underlying ve- 
locity field in detail. Our inference is clearly able to identify the 
true underlying velocity field in noisy or even completely masked 
regions. This clearly demonstrates the strength of this approach in 
extrapolating physically reasonable states of the matter distribution 
even into poorly observed regions. 



8 DISCUSSION AND CONCLUSION 

We describe a new method to perform dynamical large scale struc- 
ture inference from galaxy redshift surveys employing a second or- 
der Lagrangian perturbation model. In section [2] we demonstrated 
that the problem of constructing suitable prior distributions for the 
non-linear density field is directly linked to the problem of inferring 
initial conditions, once a dynamical model for large scale structure 



formation is given. In this approach the evolved non-linear density 
field acts as a mere nuisance parameter in the inference process, 
which shifts the problem of designing prior distributions to physi- 
cal modeling of the matter evolution dynamics. 

Since the method we propose provides an approximation to 
the non-linear dynamics the algorithm automatically provides in- 
formation on the dynamical evolution of the large scale matter dis- 
tribution. By exploring the space of dynamical histories compat- 
ible with both data and model our approach therefore marks the 
passage from Bayesian three-dimensional density inference to full 
four-dimensional state inference. 

Particularly, in this work we have employed a 2LPT model 
as an approximate dynamical description of the large scale struc- 
ture evolution on the large scales. As described in the literature, 
the 2LPT model describes the one, two and three-point statistics 
correctly and represents higher order statistics very well (see e.g. 
Moutarde et al. 1991, Buchert et al. 1994, Bouchet et al.|| 19951 
iScoccimarro,|2000, |Scoccimarro & Sheth||2002^ . Hence, the al- 
gorithm proposed in this work can exploit higher order statistics, 
modeled through the 2LPT model, to provide physically reason- 
able matter field realizations conditional on the observed galaxy 
distribution. 

It is also important to remark that the inference process de- 
scribed in section [2] requires at no point the inversion of the flow 
of time in the dynamical model. The inference process therefore 
solely depends on forward propagation of the model, which con- 
sequently alleviates many of the problems encountered in previous 
approaches to the reconstruction of initial conditions, such as spu- 
rious decaying mode amplification. Rather than inferring the initial 
conditions by backward integration in time our approach builds a 
non-linear filter using the dynamical forward model as a prior. This 
prior singles out physically reasonable large scale structure states 
from the space of all possible solutions. 

The resultant inference procedure is numerically highly non- 
trivial, since the large scale structure posterior distribution has to be 
evaluated in very high dimensional space. Typically we are dealing 
with 10* to 10^ parameters, corresponding to the voxels used to dis- 
cretize the domain. In section [3] we described an efficient Hybrid 
Monte Carlo implementation for the large scale structure inference 
problem when employing a dynamical model for large scale struc- 
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Figure 8. Three slices through the true underlying density field from three different sides over plotted with the two dimensional projection of the true velocity 
(left panels) and a sample velocity field (right panels). It can be seen that the algorithm is able to infer the true underlying dynamics of the system to great 
detail in noisy and even unobserved regions, when compared to the con'esponding data panels in figurels] 
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ture formation. Further, we discussed some details of the numerical 
implementation in sectionlS] 

To provide a proof of concept we test the algorithm in an ar- 
tificial scenario, based on the characteristics of the Sloan Digital 
Sky Survey Data Release 7. In particular, as described in section 
[6] we use the SDSS DR7 completeness map and realistic selection 
functions based on the Schechter luminosity function to generate a 
realistic testing environment essentially emulating the SDSS DR7 
main sample. 

The major aim of testing the algorithm, described in section 
[7] was to estimate the method's performance in a realistic scenario. 
An important issue to test when dealing with Markov Chain Monte 
Carlo methods is the question of how many independent samples 
can be drawn from the chain. The high efficiency of our Hybrid 
Monte Carlo scheme permits to explore the posterior distribution 
with a typical acceptance rate of about 84 per cent while main- 
taining the correlation length of the chain at or below 300 steps. 
We estimate the length of the burn-in phase to be about 600 steps. 
In summary, our tests reveal that the proposed analysis approach 
is not only within numerical reach but is efficient enough to work 
well with present day computational resources. 

The properties of the inferred large scale structure fields were 
studied in section |7.2| It is clear upon visual inspection that our 
approach returns far more physical reconstructions than previous 
methods based solely on two-point information (see e.g. 
l994l|Zaroubi|2002||Erd ogdu et al.|2004| Kitaura & EnBlin|: 



2ora 



Kitaura et aLp 009 2010| |Jasche et al.||2010| |Jasche & Kitaura] 



;. |Lahav| 
in|2008| 



Jasche et al.||2010|?). This is particularly obvious for un- 



observed regions which are augmented with statistically correct 
information, in order to account for survey geometry and cosmic 
variance. In the present approach augmented regions are visually 
indistinguishable from regions containing data. Therefore, the in- 
dividual matter field samples can be regarded as full physical mat- 
ter field realizations conditional on the observations, at least to the 
degree represented by the 2LPT model. 

By studying the one-point distributions of the inferred initial 
and final density fields we demonstrated that the algorithm cor- 
rectly recovers the Gaussian initial conditions from a galaxy obser- 
vation which does not exhibit Gaussian but highly skewed log nor- 
mal like statistics. This demonstrates that the algorithm correctly, 
accounts for the mode coupling and phase correlations originally 
introduced to the matter distribution by gravitational structure for- 
mation. In addition, it supports our initial claim that the approach 
of searching for phenomenological approximations to the full prob- 
ability distribution for the non-linear matter field can be efficiently 
reformulated as an initial condition problem once a physical model 
for large scale structure is employed. 

To estimate the accuracy of recovered density fields, we stud- 
ied the correlation between the true underlying and samples of the 
final density field as a function of the signal to noise ratios. As ex- 
pected, the correlation can reach up to 90 per cent in the high signal 
to noise regime, where S /N ~ 3. In addition, the algorithm still pro- 
vides a correlation of about 50 per cent between the true underlying 
density field and the samples in regions where only a single galaxy 
has been observed. Also note that in regions where the signal to 
noise is zero, which are either centers of voids or unobserved re- 
gions, the algorithm still provides a 10 per cent correlation. This is 
a clear manifestation of improved inference due to the incorpora- 
tion of a physical model of large scale structure formation, which 
exploits additionally three point and higher moment statistics of the 
density distribution. These tests further demonstrate that the algo- 



rithm correctly accounts for systematics such as the survey geome- 
try and selection effects. 

Along with the inferred density fields the algorithm also pro- 
vides dynamical information on the large scale flows. By compar- 
ing the true underlying velocity field to the inferred velocity field 
of an arbitrary sample we demonstrated that the algorithm accu- 
rately recovers large scale flows, even in noisy or even unobserved 
regions. This clearly demonstrates the strength of the method in 
extrapolating physically reasonable states into poorly observed re- 
gions. 

The method we describe forms the basis for a sophisticated 
and extensible dynamical large scale structure inference frame- 
work. In future work we will demonstrate the application of the 
algorithm to a real galaxy survey accounting for additional sys- 
tematics such as luminosity or color dependent bias. Note that the 
algorithm as described in this work can be easily extended to ac- 
count for any kind of non-linear and non-local bias. In particular, 
the 2LPT model, as employed in ffiis work, can already be inter- 
preted as a non-local, non-linear bias model between the initial 
conditions and the galaxy observations. It would also be possible 
to incorporate a halo model based galaxy bias model in the fashion 
as described by jScoccimarro & Sheth| ( [2002^ . The combination of 
the algorithm described in this work and the photometric redshift 
sampling method proposed in [Jasche & Wandeit| ( |201l) , will lead 
to immediate improvements for ffie inferred photometric redshifts, 
since the combination of both algorithms will exploit higher or- 
der statistics, whereas the algorithm described in |Jasche & Wandeit] 
pOTT) is solely based on two-point statistics. In a similar fashion, 
dynamical velocity information provided by the 2LPT model can 
be used to correct for redshift uncertainties in spectroscopic sur- 
veys. 

Since the algorithm is fully Bayesian, it provides inferred ini- 
tial and final density fields and also the means of estimating their 
significance and uncertainties by a sampled representation of the 
initial conditions posterior distribution. The algorithm will there- 
fore provide accurate information on the initial conditions from 
which the observed large scale structure originates. These initial 
density fields may be useful for a variety of scientific projects such 
as constrained simulations (see e.g. iKravtsov et al. 2002; Klypin 



et al.||2003[ [Dolag et al.|[2005| pbeskind et al.,,2010, ,Martinez 



Vaquero et al.|2009||GottIoeber et al.|2010[|Lavaux|2010||Dol'ag 
et al.||2012^ . Sincethe 2LPT model reconstructs the initial tidal 
field it may also open up a new way to study the angular momen- 
tum build-up of galaxies through tidal torque theory (see e.g. the 
review by |Schafer|2009[ and references therein). 

In conclusion, we presented a new Bayesian dynamical large 
scale structure inference algorithm which will provide the commu- 
nity with accurate measurements of the three dimensional initial 
density field as well as estimates of the dynamical behavior of the 
large scale structure. 
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APPENDIX A: LINEAR STRUCTURE FORMATION 

In the linear regime structure formation is governed by a ho- 
mogeneous growth function D^{a) acting on the density contrast 
S(J(,a) = D*(a)6(x,a = 1). For a general cosmology the growth 
factor D^(a) can be obtained by numerical solution of the linear 
growth equation (see e.g. [Turner & White|[l997[ |Wang & Stein-| 
|hardt|1998||Linder & Jenkins|2003| l: 

<fD+(a) 1 / dln//\ dZ)+(fl) 
da^ a \ dlna I da 



3 n,„(a)D^{a) 

2 ^^2 



(Al) 



APPENDIX B: LAGRANGIAN PERTURBATION THEORY 

In the following we will give a brief summary of second or- 
der Lagrangian perturbation theory to the degree required for the 
present work. More detailed discussion of Lagrangian perturbation 
theory in general and its application can be found in the litera 
ture (see e.g. Moutarde et al. 1991; Buchert et al. 1994, Bouchet| 



et al.|1995,,ScoccimaiT0.1998,,2000,.Scoccimarro & Sheth.200^ 



; het| 
j02| 



Bemardeau et al.||2002[ ). Also see [Bemardeau et al.| P002^ for a 
general overview of Eulerian and Lagrangian cosmological pertur- 
bation theory. 

In an expanding Robertson Friedman space time the equations 
of motion for particles solely interacting through gravity are given 
as (see e.g. [Bemardeau et al.|2002||Scoccimarro|2000|): 



d^x ^ ,dx „ 



(Bl) 



where C) is the gravitational potential and V^- is the gradient with 
respect to the Eulerian coordinates x, "H = d In a/dr and the con- 
formal time T defined by dr = dt/a . In order to solve this set of 
equations, Lagrangian perturbation theory introduces the following 
Ansatz for a solution: 



x{t) = q + fiq, t) , 



(B2) 



where T(g, t) defines the mapping from the particles initial posi- 
tion q into its final Eulerian position x (see e.g. [Bemardeau et al.| 
[2002[ [Scoccimarro|[2000l >. Equation l [B2[ l together with equation 
( B 1 1 yields a non-linear equation for the displacement field 'f(q, t) 



which can be solved perturbatively by expanding around its linear 
solution ( [Bemardeau et al.]|2002^ . To linear order, this perturba- 
tive approach yields the famous Zel'dovich approximation given as 
(jZeT'Dovich 1 9701 [Doroshkevich|[T970l [Buchert[[T989l [Moutarde[ 
[et al.|1991||Bemardeau et al.|2002[ >: 



V,1'<"(g,a) = -D*(a)6{q,a = 1) 



(B3) 



Adding second order terms to the perturbative expansion remark- 
ably improves the results of the first order Zel'dovich approxima- 
tion. In particular, second order terms account for the fact that 
gravitational instability is non-local by introducing corrections due 
to gravitational tidal effects jBemardeau et al!][2002^ . The sec- 
ond order displacement field *P*'^*(^, a) is then defined by (see e.g. 
[Bemardeau et al.|2002"i[Scoccimarro & Sheth|2002 ): 



V,^^\4, a) = \D,(a) Y, i^ ^, - "P*:; ^^ 



(B4) 



with TI" = 8^''^^ Idqj and D^ifl) is the second order growth factor 



1.1 
given as: 



(B5) 



which holds for a flat model with non-zero cosmological constant 
A and for 0.01 < fl„, < 1 to better than 0.6 per cent accuracy 
(see e.g. jBouchet et al.|1993|[Scoccimarro|1998[[Bemardeau et al.[ 
[20021 for details). 

As has been previously shown, second order Lagrangian per- 
turbation theory recovers correctly the two- and three-point statis- 
tics at large scales and further approximates higher-order statistics 
very well jMoutarde et al.[1991[[Buchert et al.[1994[[Bouchet et al.[ 



[7995l[Scoccimarro|2000||Scoccimarro & Sheth|2002| >. Also note, 
that second order corrections to the Zel'dovich approximation are 
essential to accurately describe the departure of the large scale den- 
sity field from Gaussian initial conditions jScoccimarro & Sheth[ 
[20521 [Tatekawa & Mizuno|2007[|7enkins|2010| ). 

Lagrangian solutions up to second order are curl free, as they 
follow potential flows (see e.g. Buc hert et al. 11994) [Scoccimarro[ 
[1998; Bem ardeau et al.||2002^ . Therefore, it is convenient to in- 
troduce the Lagrangian potentials <I)*'* and O'^', such that the ap- 
proximate solution to equation (IbTI can be expressed as (see e.g. 



[Buchert et al.|1994[[Sc"occimarro|1998l[Bemardeau et al.|2002[ >: 

f(r) = 4- D*{a)y^ O'" + 02*'^' , (B6) 

where the time-independent potentials O"* and <I>*"' are solutions to 
the following Poisson equations ^Buchert et al.|1994l i: 

and 



(B7) 



v^cD<2)(^ = ;^[cD<;'(^o^](^-(o;,';(^y 



(B8) 



For an excellent guide to the numerical implementation of the 2LPT 
model the reader is referred to appendix D of[Scoccimarro[p"998^. 



APPENDIX C: THE SCHECHTER LUMINOSITY 
FUNCTION 

The Schechter luminosity function is given as ^Schechterj 1976[): 



t^+l _[q0.4{M*-W) 



dM. (CI) 



(l)(M)dM = 0.4O*ln(10) [\qPMm--m)J ^ 

Note that for the purpose of calculating selection functions the nor- 
malization O* is not required. 



APPENDIX D: HAMILTONIAN FORCES FOR THE 
LIKELIHOOD TERM 

In this section we will discuss the derivation of the Hamiltonian 
forces for the 21pt-Poissonian process. To prevent confusion be- 
tween the variables describing the physical 2LPT model and the 
variables describing the Hamiltonian inference framework we will 
re express the 2LPT model in the following form for the purpose of 
the derivations in this section: 



f„ = f„(5') = 4„ - £»' Kl(6-) + £>2 llliS') , 



(Dl) 



where KUS') and K^,(6') are the first and second order displace- 
ments fields, respectively. 

As described in section |4| the likelihood term of the Hamilto- 
nian potential is given as: 

"Vi.MihooAiS'j]) = Yj ^'^s'" ^1 + <^("' ^">'^ 
I 

-N,\n{R,N,,i{l+G(a,S-),)), (D2) 
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with G{a, s) given via tlie kernel estimate as: 
W{Xp(a, s) - xi) 



G(a, s)i = 2_] 



N 



1, 



(D3) 



and Xp{a, s) is described by equation | D1| and W(x) is a CIC kernel 
(see e.g. |Hockney & Eastvyood 1988, Jas che et al.|2009| . Further- 
more, the Lagrangian displacement vectors are given as: 



K = J]^'('^r-^j)'^'h 



(D4) 



where V'(x) is the gradient of the kernel W(x). With these defi- 
nitions we can write the Hamiltonian forces corresponding to the 
likelihood term as: 



d^Iikelilmoil({S'j]) 



= z 



1 



RiN(l+G(a,6')i) 



RiN 



dG(a,S-),) 



dSL 



(D5) 



The notation can be simplified by introduce the quantity A; as: 
1 



A, 



1- 



RiN{l+6As) 
We can then write: 

d^likeUlwod{{6'j]) 



RiN. 



(D6) 



dSL 



z-^. 



dG{a,S')i 
dSi„ 



Z^Z 



d6i.. 



A, Y^ dW(^„ - f,) 






1 dK;,(6-) 
dSL 



+D' 



dKliS-) 



06' 



(D7) 

where we made use of equations (JDlJ and l |D4^ . It can be seen that 
the Hamiltonian force is the sum of two vectors. In the following 
we will therefore discuss each term independently. The first term is 
exactly the Hamiltonian force expected from a pure Zeldovich ap- 
proximation without higher order correction terms. We will start by 
evaluating the Hamiltonian force for the Zeldovich approximation. 



d6L 



Z-U Aj V^ ->, J 



N 



d6- 



-£»' A, 



* 



ZZt'^'(^^"-^'^Z^'(^^"-^^^)^ 



- zzz 



-D' A ^ ^ *' 



N 



(D8) 



The notation can be further simplified by introducing: 



^^- = Z Z ^ ^'^^" - ^■^^'^'^" - ^''^ ■ 



We can then write 



O 



861 






(D9) 



(DIO) 



The Zeldovich Approximation potential was calculated using the 
Fast Fourier Transform approach, which can be written as: 



O] 



^ Jie'"*"^ Yj ^«e-2™*^ . (DID 

Using this expression in equation JD8l we yield: 

-^'Z^^Zi^^'^^'^Z'^i'^"^"^ 



3^L,,„o.,(is-p 



66' 



k K 

-1 

k k 



. _Di ^ p. ^ Zle2-^*^e-2""*^ 



^.^Zle---^2^^.e 



2>r./*V 



k K 



(D12) 



This result looks remarkably similar to equation (IdTsI and at first 
sight one might be inclined to straightforwardly solve this equa- 
tion with FFT techniques. However, it is important to note that the 
signs have changed in the exponents, and hence equation iJDSl can 
not directly be solved with FFTs. In Appendix IE] we show what 
procedures must be followed in order to apply FFTs to this prob- 
lem. To further simplify the notation in the following steps we will 
introduce the quantity 7,„, defined as: 



Z^' 



k '^k 



■■^Y.F.i^""'"^ 



(D13) 



With this definition the ZA term of the Hamiltonian force can be 
written as: 



m^) 



d6L 



-£>' /,„ 



(D14) 



Next, we will discuss the second order Lagrangian term in 
equation (|D7): 



d6L 



D^Ai 






86:.. 



O^ 



- ZZ^*'^^^"-^^'^Z^'(«^'-^^)a^ 



(DI5) 



The second order Lagrangian potential <b^ can be calculated as 



k '^k 



>.;*V 



z 

* 
with if)„ given as: 

0- = Z'^«>«''-('^"'1 






- V=T 



(D16) 



(D17) 



a>b 



where the individual potentials 0„ are related to the signal 6'„ via 



'^k'^k ^2m,k- 



k k 



'^J]6;e 



(D18) 



With these definitions, we can write 



d6L 



'''-Z^.Z^«=-""Zt= 



. *. 
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-1 



k k 

a>h 
a>b j k k II 



ds; 






fl>Z7 j 



dSL 



k k 



n laa o tab 



- <4f + — ^ C - 2(4' 



<9(5! 



dSL 



(D19) 



In the following we will discuss the individual terms. To simplify 
the notation, we introduce the tensor t"'"^'' defined as: 



-1 _2.U-^ 
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With these definitions the second order Lagrangian contribution to 
the Hamiltonian force can be calculated as 

#i„(.v) 



dSL 



^^Z(^ 



' r\ abab\ 



This finally yields the Hamiltonian forces corresponding to the 
likelihood term: 
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(D22) 



APPENDIX E: AD JOINT EFT 

The following operation can be performed via FFT methods, when 
accounting for adjoining the operation: 
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where we made use of the fact that Gj is a real quantity, and the 
* denotes complex conjugation. Therefore, equation ( |El[ l simply 
describes the application of an FFT followed by a complex conju- 
gation. To solve the adjoint Poisson equation we calculate: 
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where in the last step we made use of the periodicity of the signal. 



APPENDIX E: HAMILTONIAN MASSES 

A good guess for the Hamiltonian masses can greatly improve the 
efficiency of the hybrid Hamiltonian sampler. In order to derive ap- 
propriate Hamiltonian masses for the 2LPT-Poissonian system we 
will follow a similar approach as described in ' Taylor et al.| l (2008[ > 
and I Jasche & Kita ura (2010). Since the efficiency of the Hamilto- 
nian sampler depends on the accuracy of the leapfrog scheme, we 
will perform an approximated stability analysis of the integrator. 
The goal of this analysis is to find an expression for the Hamilto- 
nian masses which optimizes the stability of the integration scheme 
for the 2LPT-Poissonian system. 

According to the leapfrog scheme, given in equations \23\ , 
l |24| ( and l |25| ), a single application of the leapfrog method can be 
written in the form: 



p,„(t + e) = Pm(t)- 



' &V{5') 



d5\ 



5T((5') 



■S'W 



35' 



S'(He)) 



^D2 1 ) ,,,_, (t + e) = s,„ (0 + f Xi ^'"i PJ^'^ ~ T Z 



M- 



d5\ 



(Fl) 



(F2) 



s'm 



We will then expand the Hamiltonian forces given in equation (fTSl 
around a fixed value (5')JJ,, which is assumed to be the mean sig- 
nal around which the sampler will oscillate once it left the Bum-in 
phase. Further, we will only expand up to linear order in the forces, 
which amounts to second order in the potential and hence to a Gaus- 
sian approximation of the 2LPT-Poissonian posterior distribution. 
For simplicity we will also ignore the second order Lagrangian term 
in the forces. Thus, the Hamiltonian forces can be written as: 

d6;„ " d6-„, "*" d5>,„ 

j 
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We will simplify the notation by introducing the matrix: 
dJj(6-) 



A.,u = s-y-di,D 



dd': 



S'M6')" 



and the vector: 
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Equation ( |F3^ can then be written as: 
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Introducing this approximation into equations (jFT) and l|F2l yields: 
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leapfrog step. First we have to ensure that the first term of equation 
(jFTTll does not diverge. This can be fulfilled if the eigenvalues of 
T have unit modulus. The eigenvalues A are found by solving the 
characteristic equation: 



det 



\A'-2A\l- — AM"'|+I 



0. 



(F12) 



Note that this is a similar result to what was found in [Taylor et al.| 
,p^JJ2008l. Our aim is to explore the parameter space rapidly, and 
therefore we wish to choose the largest e still compatible with the 
stability criterion. However, any dependence of equation l |F12[ l also 
implies that no single value of e will ensure unit modulus for every 
eigenvalue. For this reason we choose: 



A = M. 



We then obtain the characteristic equation: 
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where A' is the number of voxels. This yields the eigenvalues: 

^2 1 
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which have unit modulus for 6 < 2. The second term in equation 
iFll I involves evaluation of the geometric series YH^o T'- The ge- 
ometric series for a matrix converges if and only if |/i,| < 1 for 
each Aj eigenvalue of T. This clarifies that the nonlinearities in the 
Hamiltonian equations generally do not allow for arbitrary large 
pseudo time steps 6. In addition, for practical purposes we usually 
restrict the mass matrix to the diagonal of equation l|F4l. In prac- 
tice we choose the pseudo time step e as large as possible while still 
obtaining a reasonable rejection rate. 

Given these assumptions we can assume the mass matrix to 
be: 
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This result can be rewritten in matrix notation as: 
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where the matrix T is given as: 

^m-'a] [i- !^am-'] ^ 

with I being the identity matrix. Successive applications of the 
leapfrog step yield the following propagation equation: 
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This equation demonstrates that there are two criteria to be ful- 
filled if the method is to be stable under repeated application of the 
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where we used of equations l|D9[> and (|D13^. According to equation 



(D6k -T- can be expressed as: 
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where we introduced the quantity fi, = 

[RiN'j/(^RiN{l+G{a,6')i)) to simplify notation. We then 
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arrive at the expression: 
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